A Voted Regularized Dual Averaging Method for Large-Scale Discriminative Training in Natural Language Processing
Authors
Abstract
We propose a new algorithm based on the dual averaging method for large-scale discriminative training in natural language processing (NLP), as an alternative to perceptron algorithms or stochastic gradient descent (SGD). The new algorithm estimates the parameters of linear models by minimizing L1-regularized objectives and is effective in obtaining sparse solutions, which is particularly desirable for large-scale NLP tasks. We then give the mistake bound of the algorithm and show how the bound is affected by the additional L1 regularization term. Evaluations on the tasks of parse reranking and statistical machine translation attest to the success of the new algorithm.
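As a minimal sketch of the underlying technique (not the paper's exact implementation), the closed-form l1-RDA update of Xiao's regularized dual averaging method soft-thresholds the running average of the gradients, which is what drives many coordinates to exactly zero and yields the sparse solutions described above. The function name and parameter defaults here are illustrative assumptions:

```python
import numpy as np

def l1_rda_step(g_bar, t, lam=0.01, gamma=1.0):
    """Closed-form l1-RDA update (Xiao, 2010), illustrative sketch:
    w_{t+1,i} = -(sqrt(t)/gamma) * (g_bar_i - lam*sign(g_bar_i))
                if |g_bar_i| > lam, else 0.
    g_bar is the average of all (sub)gradients seen in rounds 1..t.
    Coordinates whose average gradient stays within [-lam, lam] are
    truncated to exactly zero, producing a sparse weight vector."""
    shrunk = np.sign(g_bar) * np.maximum(np.abs(g_bar) - lam, 0.0)
    return -(np.sqrt(t) / gamma) * shrunk
```

Note that, unlike SGD with an L1 penalty, the thresholding is applied to the *averaged* gradient rather than to the current iterate, so small-but-persistent features survive while noisy ones are zeroed out.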
Similar References
Online Classification Using a Voted RDA Method
We propose a voted dual averaging method for online classification problems with explicit regularization. This method employs the update rule of the regularized dual averaging (RDA) method proposed by Xiao, but only on the subsequence of training examples where a classification error is made. We derive a bound on the number of mistakes made by this method on the training set, as well as its gen...
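The snippet above describes two ingredients that can be sketched together: the RDA update is applied only on rounds where the current weights misclassify an example, and the final predictor averages ("votes over") the iterates, as in the averaged perceptron. The training loop below is a hypothetical illustration under those assumptions, not the authors' code:

```python
import numpy as np

def voted_rda_train(X, y, lam=0.01, gamma=1.0, epochs=1):
    """Mistake-driven l1-RDA sketch: the running gradient sum is updated
    only on examples the current weights misclassify, and the returned
    predictor is the average of the per-example iterates (the "vote")."""
    n, d = X.shape
    g_sum = np.zeros(d)   # sum of subgradients collected on mistake rounds
    w = np.zeros(d)       # current iterate
    w_sum = np.zeros(d)   # accumulator for the voted/averaged predictor
    t = 0                 # mistake counter
    for _ in range(epochs):
        for i in range(n):
            if y[i] * X[i].dot(w) <= 0:    # classification mistake
                t += 1
                g_sum += -y[i] * X[i]      # hinge subgradient at a mistake
                g_bar = g_sum / t          # average over mistake rounds only
                shrunk = np.sign(g_bar) * np.maximum(np.abs(g_bar) - lam, 0.0)
                w = -(np.sqrt(t) / gamma) * shrunk
            w_sum += w
    return w_sum / (epochs * n)
```

Updating only on mistakes is what makes a mistake bound (rather than a regret bound) the natural analysis tool for this variant.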
Iterative parameter mixing for distributed large-margin training of structured predictors for natural language processing
The development of distributed training strategies for statistical prediction functions is important for applications of machine learning generally, and the development of distributed structured prediction training strategies is important for natural language processing (NLP) in particular. With ever-growing data sets this is, first, because it is easier to increase computational capacity by...
Sparse Word Embeddings Using ℓ1 Regularized Online Learning
Recently, the Word2Vec tool has attracted a lot of interest for its promising performance in a variety of natural language processing (NLP) tasks. However, a critical issue is that the dense word representations learned by Word2Vec lack interpretability. It is natural to ask whether one could improve their interpretability while keeping their performance. Inspired by the success of sparse mo...
Survey on Three Reranking Models for Discriminative Parsing
This survey is inspired by the so-called reranking techniques in natural language processing (NLP). Its aim is to provide an overview of three main reranking tasks, particularly for discriminative parsing. We will focus on the motivation for discriminative reranking, on the three models (the boosting model, the support vector machine (SVM) model, and the voted perceptron model), and on the procedur...
Journal:
Volume / Issue:
Pages: -
Publication date: 2013